Supervised Learning: Classification

Project #3

Personal Loan Campaign

Craig Drummond

Context

AllLife Bank is a US bank that has a growing customer base. The majority of these customers are liability customers (depositors) with varying sizes of deposits. The number of customers who are also borrowers (asset customers) is quite small, and the bank is interested in expanding this base rapidly to bring in more loan business and in the process, earn more through the interest on loans. In particular, the management wants to explore ways of converting its liability customers to personal loan customers (while retaining them as depositors).

A campaign that the bank ran last year for liability customers showed a healthy conversion rate of over 9%. This has encouraged the retail marketing department to devise better-targeted campaigns to increase the success ratio.

As a data scientist at AllLife Bank, you have to build a model that helps the marketing department identify the potential customers who have a higher probability of purchasing the loan.

Objective

  1. To predict whether a liability customer will buy a personal loan or not.
  2. To identify which variables are most significant.
  3. To identify which segment of customers should be targeted more.

Conclusions

Recommendations


Detailed Analysis

Libraries required for this analysis

Data Dictionary

Label: Description
ID: Customer ID
Age: Customer's age in completed years
Experience: Years of professional experience
Income: Annual income of the customer (in thousand dollars)
ZIP Code: Home address ZIP code
Family: Family size of the customer
CCAvg: Average spending on credit cards per month (in thousand dollars)
Education: Education level. 1: Undergrad; 2: Graduate; 3: Advanced/Professional
Mortgage: Value of house mortgage, if any (in thousand dollars)
Personal_Loan: Did this customer accept the personal loan offered in the last campaign?
Securities_Account: Does the customer have a securities account with the bank?
CD_Account: Does the customer have a certificate of deposit (CD) account with the bank?
Online: Does the customer use internet banking facilities?
CreditCard: Does the customer use a credit card issued by any other bank (excluding AllLife Bank)?

Imports & Initial Configurations

Load & Initial Evaluation of the Data

Observations

Describe the data

Observations

outliers

Functions Used In This Notebook


Exploratory Data Analysis - EDA

Observations

Observations on age

Observations on income

income outliers

cc_avg outliers
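
The method used to flag outliers in this notebook is not shown in the text, so the following is only a sketch of one common approach, the 1.5 x IQR rule, written with the standard library. The function name `iqr_outliers` and the sample values are hypothetical and are not taken from the bank's data.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical monthly card spending in thousand dollars (not the real data).
cc_avg_sample = [0.4, 0.7, 1.0, 1.5, 1.8, 2.0, 2.5, 3.0, 9.5]
outliers = iqr_outliers(cc_avg_sample)
print(outliers)
```

The same function could be applied to the income column. Whether flagged values should be dropped, capped, or kept depends on the model; decision trees are generally tolerant of extreme values.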

Observations

Observations

Observations

Observations

Observations

Negative values of experience will be replaced with their absolute values, so each negative number becomes the same positive number (for example, -3 becomes 3).
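
A minimal sketch of that conversion with pandas is shown here. The column name `Experience` comes from the data dictionary; the DataFrame name `df` and the sample rows are assumptions made for illustration only.

```python
import pandas as pd

# Hypothetical sample rows; the real data set is not reproduced here.
df = pd.DataFrame({"Age": [23, 24, 45], "Experience": [-1, -2, 19]})

# Count the negative entries before fixing them.
n_negative = int((df["Experience"] < 0).sum())

# Replace each negative value with its absolute value.
df["Experience"] = df["Experience"].abs()

print(n_negative, df["Experience"].tolist())
```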

Observations

Distribution of numerical columns

Observations

Observations

Observations

Observations

Observations

Observations

Observations

Observations


Bivariate Analysis

Observations

Observations

Observations

Observations

Observations

Observations

Observations

Observations

Observations

Observations


Build, Train, and Evaluate a Model

Functions that can now be initialized

Observations

Recall score from initial model
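
Recall is the share of actual loan buyers that the model identifies: true positives divided by true positives plus false negatives. The sketch below computes it by hand on hypothetical labels; the helper name `recall` and the label lists are illustrative and not taken from the notebook.

```python
def recall(y_true, y_pred, positive=1):
    """Recall = TP / (TP + FN) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical labels: 1 = accepted the loan, 0 = did not.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
score = recall(y_true, y_pred)
print(score)
```

Recall suits this problem because a missed buyer (false negative) is a lost loan, which is likely more costly to the bank than contacting a customer who declines.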

Visualize the Decision Tree of the initial model

Feature Importance

GridSearch to determine the best combination of hyperparameters for our next model
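
The grid and data actually used in this notebook are not shown in the text. As a sketch only, a recall-scored search over a decision tree with scikit-learn's `GridSearchCV` could look like the following. The synthetic data, the parameter values, and all variable names are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the loan data (about 10% positives).
X, y = make_classification(
    n_samples=500, n_features=8, weights=[0.9, 0.1], random_state=1
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1
)

# Hypothetical grid; the values searched in the notebook may differ.
param_grid = {
    "max_depth": [3, 5, None],
    "min_samples_leaf": [1, 5, 10],
    "criterion": ["gini", "entropy"],
}

# Score each combination by cross-validated recall on the training set.
grid = GridSearchCV(
    DecisionTreeClassifier(random_state=1), param_grid, scoring="recall", cv=5
)
grid.fit(X_train, y_train)

best_tree = grid.best_estimator_
print(grid.best_params_, round(grid.best_score_, 3))
```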

Confusion Matrix using GridSearchCV trained data

Recall Score using GridSearchCV trained data

Visualizing the decision tree using GridSearchCV trained data

Feature Importance using GridSearchCV

Cost Complexity Pruning
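
Cost complexity pruning grows a full tree, then removes the weakest branches as the penalty `ccp_alpha` increases. The sketch below, on synthetic data, fits one tree per candidate alpha and keeps the one with the highest recall on a held-out split. The data, the split, and the selection rule are assumptions, not the notebook's exact procedure.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the loan data.
X, y = make_classification(
    n_samples=500, n_features=8, weights=[0.9, 0.1], random_state=1
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1
)

# Candidate alphas from the pruning path of the unpruned tree.
path = DecisionTreeClassifier(random_state=1).cost_complexity_pruning_path(
    X_train, y_train
)
# Drop the last alpha (it prunes down to the root) and clip any tiny
# negative values caused by floating-point error.
ccp_alphas = [max(float(a), 0.0) for a in path.ccp_alphas[:-1]]

# Fit one pruned tree per alpha and record its held-out recall.
results = []
for alpha in ccp_alphas:
    tree = DecisionTreeClassifier(random_state=1, ccp_alpha=alpha)
    tree.fit(X_train, y_train)
    results.append((recall_score(y_val, tree.predict(X_val)), alpha, tree))

best_recall, best_alpha, pruned_tree = max(results, key=lambda r: r[0])
print(round(best_recall, 3), best_alpha, pruned_tree.get_n_leaves())
```

Choosing alpha on the same split used for the final report would overstate performance, so a separate validation split or cross-validation is the safer choice.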

Visualizing the final Decision Tree

The decision tree model with post-pruning gave the best recall score.